BELL INEQUALITIES AND ENTANGLEMENT 


REINHARD F. WERNER AND MICHAEL M. WOLF*

Institute for Mathematical Physics, TU Braunschweig 
Mendelssohnstr. 3, 38102 Braunschweig, Germany






We discuss general Bell inequalities for bipartite and multipartite systems, emphasizing 
the connection with convex geometry on the mathematical side, and the communication 
aspects on the physical side. Known results on families of generalized Bell inequali- 
ties are summarized. We investigate maximal violations of Bell inequalities as well as 
states not violating (certain) Bell inequalities. Finally, we discuss the relation between 
Bell inequality violations and entanglement properties currently discussed in quantum 
information theory.

1. Introduction and historical survey



Quantum mechanics was born in a remarkably short period around the year 1926, when 
the long period of guessing turned into the successful building of the theory, starting a 
golden age of remarkable discoveries. In a similar way we can put a date to the beginning 
of a particular branch of quantum theory, the theory of entanglement. It is the year 1935,
which saw both the paper of Einstein, Podolsky and Rosen ("EPR" 1) and (motivated by the
EPR paper) Schrödinger's article 2, in which he coined the term "verschränkter Zustand"
(entangled state).

Einstein and Schrödinger had both made crucial contributions to the development of
quantum theory, yet they were both expressing a deep dissatisfaction with the "present
situation of quantum theory" (Schrödinger's title). And both articles were dismissed by
some of the younger generation as the grumblings of old men who were just not able to 
follow the new lines of thought. Bohr's reply 3 to the EPR paper, although little more than 
a refusal to accept the problem, was hailed as a conclusive rebuttal, and almost everybody 
went back to business. In hindsight, however, one must admit that Einstein was struggling 
with the deepest departure from classical physics contained in quantum physics: not the 
discrete "jumps" and other such conspicuous features, but entanglement. And he was
remarkably lucid, even though we may not share his conclusions.

There are perhaps two reasons in the EPR paper itself which led to its rather delayed
impact on the physics community. One was that the concern of "completeness" seemed like
something to worry about in the distant future, for a community buzzing with successful
applications of the theory. The other was that the example seemed a bit contrived: the
state discussed is a highly singular object, and even now there are papers trying to make
mathematical sense out of it. As a result, the perfect correlation between results at two
distant locations, which was a crucial part of the argument, was not even rigorously true.
This defect was overcome in Bohm's version of the argument 4, using spins ("qubits"). The
purely meta-theoretical appeal was changed dramatically by John Bell 5 in 1964, by the
observation that the EPR dilemma could be formulated in the form of assumptions, which
naturally led to a falsifiable prediction. It is hardly possible to overrate the importance
of this discovery, which made it possible to rule out not just a particular scientific theory,
but the very way scientific theories had been formulated for centuries.

The history of experimental verifications of the violations of Bell's inequalities, pre- 
dicted by quantum theory, is essentially the story of building efficient sources for entangled 
systems. A breakthrough, bringing the first reliable violations of the inequalities, was Alain 
Aspect's atomic cascade 6, which used the then relatively new technology of optical
pumping. The search for good sources of entangled systems has become much more intensive
with the advent of quantum information theory, in which entanglement is a key resource. 
The emphasis has thus shifted from demonstrating entanglement to using it, and experi- 
mental violations of Bell's inequalities are often merely a first check whether the source is 
working properly. New sources (e.g., using parametric down conversion) now admit Bell 
experiments in the student lab. Even the infamous "detection loophole" (related to the 
fact that only a small fraction of the produced pairs are really detected) is rapidly being 
closed now 7 . 

On the theoretical side, "violation of Bell's inequalities" had become synonymous with
"non-classical correlation", i.e., entanglement. One of the first papers in which finer
distinctions were made was the construction of states with the property (Ref. 8, see
Sec. 4.2 below) that they satisfy all the usual assumptions leading to the Bell
inequalities, but can still not be generated by a purely classical mechanism (are not
"separable" in modern terminology). This example pointed out a gap between the obviously
entangled states (violating a Bell inequality) and the obviously non-entangled ones, which
are merely classically correlated (separable). In 1995 Popescu 9 (and later 10) narrowed
this gap considerably by showing that after local operations and classical communication
one could "distill" entanglement, leading once again to violations, even from states not
violating any Bell inequality initially. Similar examples were then constructed by Gisin 11
even for the case of two qubits. To summarize this phase: it became clear that violations
of Bell inequalities, while still a good indicator for the presence of non-classical
correlations, by no means capture all kinds of "entanglement".

The natural conjecture in this situation was that "violation of some Bell inequality after
suitable distillation" might be synonymous with entanglement, i.e., distillation should be
possible for every non-separable state. But in 1998 the Horodecki family 12 constructed
counterexamples, the so-called bound entangled states (see Sec. 4.3). Due to a certain
property (the positivity of the partial transpose) these states turned out to satisfy all
of the known Bell inequalities 13,14. Up to now, for the bipartite case, it is neither clear
whether the violation of a Bell inequality already implies distillability, nor do we know
whether there is any Bell inequality which is violated by a state having positive partial
transpose. For multipartite systems, however, the structure of the state space with respect
to entanglement properties is much richer, and Dür 15 recently showed that there indeed
exist undistillable multipartite states violating a Bell inequality.

The purpose of this article is to give a theoretical review of the derivation of Bell
inequalities from classical assumptions, to discuss their quantum violations, and to
illuminate relations to entanglement properties and quantum information theory in general.
Moreover, we emphasize the connection with convex geometry in the appendix.

As nowadays new papers concerning Bell inequalities or closely related topics are posted
on the Los Alamos e-print archive almost every day, this will by no means be an exhaustive
discussion. We will for instance disregard related topics like the Kochen-Specker theorem,
nonlocal hidden variable theories, experimental implementations, and the "Bell theorem
without inequalities" 16. However, these restrictions will enable us to give an otherwise
rather self-contained review of Bell inequalities and entanglement. Other review-like
articles and extensive discussions emphasizing different topics can be found in
Refs. 17, 18, 19, 20.

2. Derivations of the Inequalities 

There are many derivations of Bell inequalities in the literature. This may at first be a 
bit surprising for such a simple mathematical statement. However, the hard work in such 
a derivation is almost never mathematical but conceptual: if we want to draw far-reaching 
conclusions ruling out whole classes of theories, or ways of formulating natural laws, we 
have to analyze theories on a very general and abstract level in order to even state the 
assumptions of "Bell's Theorem". Naturally, there are many ways to say what the really
essential assumptions are, depending on philosophical taste and scientific background. 

However, in all derivations two types of elements can be identified:

    locality                  classicality
    no-signalling             hidden variables
    non-contextuality         classical logic
                              joint distributions
                              counterfactual definiteness
                              "realism"


Since Bell's inequalities are found to be violated in Nature 7 , one of these two assumptions 
needs to be dropped. Quantum mechanics (in statistical interpretation) chooses locality, 
whereas hidden variable theories drop locality in order to retain a description by classical 
parameters. In either case, however, fundamental features of the pre-quantum way of 
describing the world are lost. 

2.1. Basic notation

Bell type inequalities always refer to correlations between two or more "parties" or
sites. It is helpful to imagine that the experiments at the sites are conducted by physicists,
traditionally named Alice and Bob in the bipartite (two party) case. Each of the parties
gets a particle (or "subsystem") from a common source, and makes a measurement on
her/his subsystem. The basic objects of the theory are the joint probabilities obtained in
this way. We will denote a typical measured probability by

$$ P(a_2, b_1 \mid A_2, B_1) , \tag{1} $$

where after the vertical bar we write the devices used, in this case device $A_2$ by Alice
and $B_1$ by Bob, and before the bar we denote the particular outcomes: $a_2$ is a possible
outcome of the device $A_2$ and $b_1$ an outcome of $B_1$. For simplicity, we will assume
throughout that only finitely many outcomes are possible for each measurement. The
collection of all these numbers is the basic raw data, which we might call the correlation
table.

Of course, these data have to satisfy some constraints which follow already from the
probability interpretation: $P(a_2, b_1 \mid A_2, B_1) \geq 0$, and all the probabilities in
a particular setup $(A, B)$ have to add up to 1:

$$ \sum_{a,b} P(a, b \mid A, B) = 1 . \tag{2} $$

An interesting role is played by the marginals, which we denote by

$$ P(a \mid A, B) = \sum_{b} P(a, b \mid A, B) . \tag{3} $$

These are the probabilities measured by Alice in a given setup (A, B). For general cor- 
relation tables such marginals might depend on the whole setup and, in particular, on 
the device B chosen by Bob. For example, the device B might be a transmitter with a 
particular input fed into it, and A might be a receiver. Then this dependence on B would 
be precisely what is required for Alice to 'get the message'. Note, however, that this 
usually requires some signal-carrying physical system to go from Bob to Alice, contrary 
to the basic description of the correlation setup ("all parties get particles from a common
source"). What we expect in a general correlation experiment (without communication
between the parties) is the following no-signalling condition:

$$ \sum_{b} P(a, b \mid A, B) \equiv P(a \mid A) \quad \text{is independent of } B , \tag{4} $$

and similarly for all other sites, in this case for Bob's marginals, which must be
independent of $A$.
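The constraints (2) to (4) are easy to test mechanically on a finite correlation table. The following is a minimal sketch, assuming a hypothetical encoding of the table as a Python dictionary that maps each setup (A, B) to a dictionary of outcome pairs (a, b) and their probabilities. The two example tables (a shared coin and a one-way "transmitter") are illustrations invented here, not data from the text.

```python
from itertools import product

def is_normalized(table, tol=1e-12):
    """Eq. (2): probabilities are non-negative and sum to 1 in every setup."""
    return all(
        all(p >= -tol for p in dist.values()) and abs(sum(dist.values()) - 1.0) <= tol
        for dist in table.values()
    )

def alice_marginal(table, A, B):
    """Eq. (3): sum over Bob's outcomes b in the setup (A, B)."""
    marginal = {}
    for (a, _b), p in table[(A, B)].items():
        marginal[a] = marginal.get(a, 0.0) + p
    return marginal

def is_no_signalling_to_alice(table, tol=1e-12):
    """Eq. (4): Alice's marginal must not depend on Bob's device B."""
    devices_A = sorted({A for (A, _) in table})
    devices_B = sorted({B for (_, B) in table})
    for A in devices_A:
        reference = alice_marginal(table, A, devices_B[0])
        for B in devices_B[1:]:
            other = alice_marginal(table, A, B)
            outcomes = set(reference) | set(other)
            if any(abs(reference.get(a, 0.0) - other.get(a, 0.0)) > tol for a in outcomes):
                return False
    return True

outcomes = (+1, -1)
setups = [(A, B) for A in ("A1", "A2") for B in ("B1", "B2")]

# Hypothetical example 1: both parties read off the same fair coin.
shared_coin = {
    s: {(a, b): (0.5 if a == b else 0.0) for a, b in product(outcomes, outcomes)}
    for s in setups
}

# Hypothetical example 2: Alice's outcome reveals which device Bob used.
transmitter = {
    (A, B): {
        (a, b): (1.0 if b == +1 and a == (+1 if B == "B1" else -1) else 0.0)
        for a, b in product(outcomes, outcomes)
    }
    for (A, B) in setups
}

print(is_no_signalling_to_alice(shared_coin), is_no_signalling_to_alice(transmitter))
```

Both tables are properly normalized, but only the shared coin satisfies the no-signalling condition; in the transmitter table Alice's marginal changes with Bob's device, which is exactly the dependence described above.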

Before coming to the conditions leading to Bell inequalities we have to clear up two 
common misunderstandings concerning hidden variables and nonlocal effects. These two 
subsections can be skipped; the formal development continues in Sec. 2.4.

2.2. Hidden variables exist 




"Hidden variables" have a bad name in the physics community. Yet the question 
whether we can understand the observed quantum randomness as arising from our igno- 
rance of some underlying classical variables (this is what the term means) is a fundamental 
question, which must be addressed seriously if we want to understand quantum mechanics 
at all. 

There was an early argument by von Neumann 21 proving the non-existence of such 
variables. But von Neumann's proof was making heavy use of the quantum mechanical 
structure, so it was really convincing only for those who had already accepted the con- 
clusion. The unfortunate consequence was perhaps that some people began to think that 
constructing a hidden variable theory, and thereby contradicting the great von Neumann, 
was somehow a non-trivial achievement in itself. What we want to show here is that, quite 
to the contrary, it is trivial to construct such a theory. Moreover, this will allow us later 
to point out more precisely the price to be paid in all such theories, namely some kind of 
non-locality. 

The simplest classical structure from which all measured results can be obtained was 
already mentioned: it is the collection of correlation data itself. This is saying little more 
than that gathering statistical data is an activity completely within the domain of
classical probability. Thus in this "theory", which is remarkable only for having
explanatory power exactly equal to zero, the hidden variables are the data to be measured,
and they are hidden in the same way that the future is. The hidden variable thus contains a
complete description of the experimental setup, i.e., of the devices (A, B, C, ...) chosen
by all the parties.^a This feature marks so-called contextual or, more simply, "non-local"
hidden variable theories.

Contextuality is, of course, not always as blatant, especially in hidden variable theories
focusing on dynamical laws (such as the Bohm/Nelson theory 22,23 and its generalizations 24),
where the "setups" are not apparent, but enter via a description of the measuring devices
inside the theory. By far the most widespread hidden variable theory is the "individual
state" interpretation of quantum mechanics, according to which some wave function is
somehow attached to each individual system, and constitutes a "catalog of all
expectations" to be measured on the system. Technically, this is indeed nothing but the
description of a hidden variable theory, although such statements can also be found in the
writings of the Copenhagen Masters, who are not usually associated with hidden variable
views.

2.3. The Ping Pong Ball Test 

This shows that the temptation to use a language which is too naively classical is very
great. It is especially great when one has to explain the quantum world to a general
public. This is the only excuse for the number of confusing explanations one can find.
Here is a simple guideline for spotting many of the misleading ones.
Take any explanation of Bell inequalities or quantum non-locality, and substitute ping
pong balls for every quantum particle in the account. Then, if what the author is selling as
paradoxical still remains true, he/she isn't telling you anything about quantum mechanics
after all.

^a We could also say that there is a separate probability space to be chosen for each
experimental setup, although we can equivalently put them all in a single probability space
declaring different setups to be independent.

Surprisingly many texts fail this test^b and lead to "paradoxes" like: imagine a box
containing a ping pong ball, which can be separated in two parts, without looking at the 
ball. The two parts are then shipped far apart from each other and after opening one box 
we then know "instantaneously" whether the ball is at the distant location or not. This 
is true, but hardly paradoxical, and certainly utterly useless for sending a message. 

2.4. Local hidden variable theories and the CHSH inequality 

To formalize the idea of a local hidden variable theory, let us explicitly introduce a
hidden variable $\lambda$ which takes values in a space $\Lambda$. We assume that the systems
sent to Alice and Bob (and maybe the others) are described by $\lambda$ in all details
necessary to compute their response to any measurement, or at least to determine the
probabilities of all such responses. Thus for any measuring device $A$ of Alice, and any
possible outcome $a$ of this device we get a response probability function
$\lambda \mapsto \chi_A(a, \lambda)$. The source of the correlation experiment is
characterized by the probabilities with which the different $\lambda$ occur, i.e., by a
probability measure $M$ on $\Lambda$. With these data we can thus compute all the
correlations

$$ P(a, b \mid A, B) = \int M(d\lambda)\, \chi_A(a, \lambda)\, \chi_B(b, \lambda) . \tag{5} $$

We will say that a correlation table allows a local classical model, if it can be represented
in this form.

But should this not always be the case? After all, we have only written down what any
probabilist would write anyhow: there is a random variable $\chi_A$ for each device, and we
are looking at the joint distribution as we should. But a comparison with the trivial
contextual theories shows immediately where the additional locality assumption is: in
principle the response probabilities $\chi_A(a, \lambda)$ might also depend on the devices
$B, \ldots$ chosen at other sites, and by excluding this dependence we have increased the
demands on our model. This is precisely the modification leading to Bell's inequalities.

The standard example of a Bell inequality is the Clauser-Horne-Shimony-Holt (CHSH)
inequality 26, which refers to correlation experiments with two ±1 valued observables on
two sites. From the response probabilities $\chi_A(a, \lambda)$ ($a = \pm 1$) we form the
mean value of the random variable $a$ (given the hidden variable $\lambda$):

$$ \hat a(\lambda) = \chi_A(+1, \lambda) - \chi_A(-1, \lambda) , \tag{6} $$

and from these the correlation function

$$ B(\lambda) = \frac{1}{2} \bigl[ \hat a_1(\lambda) \bigl( \hat b_1(\lambda) + \hat b_2(\lambda) \bigr) + \hat a_2(\lambda) \bigl( \hat b_1(\lambda) - \hat b_2(\lambda) \bigr) \bigr] . \tag{7} $$



^b A subtle way of failing the ping pong ball test is to supplement the description by the
statement that the properties of quantum particles (in contrast to those of ping pong balls
or socks) remain "objectively undecided" until the measurement is made. Of course, this
merely shifts the burden of explanation to that rather cryptic phrase.

We claim that $B$ satisfies the pointwise inequality $|B(\lambda)| \leq 1$: indeed, the
extreme values are attained when each $\hat a_i(\lambda)$, $\hat b_j(\lambda)$ is extremal,
i.e., $\pm 1$. But then $2B(\lambda)$ is an even integer, and since $2B(\lambda) = 4$ requires
$\hat a_1(\lambda) = \hat b_1(\lambda) = \hat b_2(\lambda) = \hat a_2(\lambda) = -\hat b_2(\lambda)$,
a contradiction, we must have $2B(\lambda) \leq 2$. The lower bound $2B(\lambda) \geq -2$
follows in the same way.

Since $B$ is pointwise bounded, its expectation, the so-called Bell correlation,

$$ \beta = \int M(d\lambda)\, B(\lambda) , \tag{8} $$

is also bounded by unity. But $\beta$ can be expressed directly in terms of the measured
correlation table: introducing the expectation values
$E(A, B) = \sum_{a,b=\pm 1} a\, b\, P(a, b \mid A, B)$, the inequality $|\beta| \leq 1$
becomes the CHSH inequality:

$$ \frac{1}{2} \bigl( E(A_1, B_1) + E(A_1, B_2) + E(A_2, B_1) - E(A_2, B_2) \bigr) \leq 1 . \tag{9} $$
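To make the chain from Eq. (5) to Eq. (9) concrete, the sketch below builds a correlation table from a small local classical model and evaluates the Bell correlation. It assumes a finite hidden variable space, so that the integral in Eq. (5) becomes a weighted sum. The weights, the response probabilities and the helper names (`response`, `correlation_table`, `bell_correlation`) are hypothetical choices made for this illustration only.

```python
from itertools import product

def response(p_plus):
    """Build chi(a, lam) from a dict lam -> probability of the outcome +1."""
    return lambda a, lam: p_plus[lam] if a == +1 else 1.0 - p_plus[lam]

def correlation_table(weights, chi_A, chi_B):
    """Eq. (5) for finite Lambda: P(a,b|A,B) = sum_lam M(lam) chi_A(a,lam) chi_B(b,lam)."""
    table = {}
    for A, B in product(chi_A, chi_B):
        table[(A, B)] = {
            (a, b): sum(w * chi_A[A](a, lam) * chi_B[B](b, lam) for lam, w in weights.items())
            for a, b in product((+1, -1), repeat=2)
        }
    return table

def E(table, A, B):
    """Expectation value E(A, B) = sum_{a,b} a b P(a,b|A,B)."""
    return sum(a * b * p for (a, b), p in table[(A, B)].items())

def bell_correlation(table):
    """Left-hand side of the CHSH inequality, Eq. (9)."""
    return 0.5 * (E(table, "A1", "B1") + E(table, "A1", "B2")
                  + E(table, "A2", "B1") - E(table, "A2", "B2"))

# Hypothetical non-deterministic model with three values of lambda.
weights = {0: 0.5, 1: 0.3, 2: 0.2}
chi_A = {"A1": response({0: 1.0, 1: 0.9, 2: 0.2}), "A2": response({0: 1.0, 1: 0.1, 2: 0.7})}
chi_B = {"B1": response({0: 1.0, 1: 0.8, 2: 0.0}), "B2": response({0: 1.0, 1: 1.0, 2: 0.5})}
table = correlation_table(weights, chi_A, chi_B)
beta = bell_correlation(table)

# Hypothetical deterministic model in which every outcome is +1; it saturates the bound.
always_plus = response({0: 1.0})
table_det = correlation_table({0: 1.0}, {"A1": always_plus, "A2": always_plus},
                              {"B1": always_plus, "B2": always_plus})
beta_det = bell_correlation(table_det)

print(beta, beta_det)
```

Whatever weights and response probabilities are substituted here, the resulting Bell correlation stays within the interval from -1 to 1, since the table is of the form (5) by construction.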



Of course, the existence of local models underlying this inequality is very reminiscent
of the no-signalling condition. In fact, the locality of the classical model is precisely the
condition that the no-signalling property persists even when we know the value of $\lambda$,
or the source has been upgraded to produce only one $\lambda$. Conversely, given a local
classical model, the no-signalling condition for the experimental correlation data is
merely the "locality property on average".

2.5. Deterministic models and classical configurations 

An important motivation for the search for hidden variables was to restore the
determinism of classical physics or, using Einstein's famous metaphor, to allow God to quit
gambling. We can easily formulate determinism as a requirement for a local classical
model: the knowledge of $\lambda$ should not only allow us to predict the probabilities of
outcomes, but the outcomes themselves with certainty. Thus we call a local hidden variable
model deterministic, if the response functions take only the values 0 and 1. It turns out,
however, that this seemingly much stronger constraint on the model does not lead to
sharper conditions on the correlation data 25.

The reason is that we can upgrade any non-deterministic model to a deterministic one.
To do this we only need to incorporate the randomness in the measuring device into the
hidden variable. Mathematically, we replace the hidden variable $\lambda$ by
$\lambda' = (\lambda, \xi_A, \xi_B)$, where $\xi_A$ and $\xi_B$ are uniformly distributed
random variables on $[0,1]$, which are independent of each other and of $\lambda$. We then
set

$$ \chi'_A(+1, \lambda') \equiv \chi'_A\bigl(+1, (\lambda, \xi_A, \xi_B)\bigr) = \begin{cases} 1 & \text{if } \xi_A \leq \chi_A(+1, \lambda) \\ 0 & \text{otherwise} \end{cases} \tag{10} $$

(written here for a ±1 valued observable, with
$\chi'_A(-1, \lambda') = 1 - \chi'_A(+1, \lambda')$), and similarly for $B$. Obviously, this
model is deterministic, and it is straightforward to check that it produces the same
correlation table as (5).
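The claim that the deterministic upgrade reproduces the same correlation table can be checked numerically. The sketch below is an illustration under stated assumptions: a hypothetical two-valued hidden variable, one ±1 valued device on each side, and a threshold rule for the deterministic response (outcome +1 exactly when the auxiliary uniform variable does not exceed the original probability of +1). It compares the exact value of P(+1, +1 | A, B) from Eq. (5) with a seeded Monte Carlo estimate from the deterministic model.

```python
import random

# Hypothetical finite model: weights M(lambda) and probabilities of the outcome +1.
weights = {"l0": 0.4, "l1": 0.6}
p_plus_A = {"l0": 0.9, "l1": 0.3}   # chi_A(+1, lambda)
p_plus_B = {"l0": 0.2, "l1": 0.7}   # chi_B(+1, lambda)

def chi_det(outcome, lam, xi, p_plus):
    """Deterministic response function: takes only the values 0 and 1."""
    plus = 1 if xi <= p_plus[lam] else 0
    return plus if outcome == +1 else 1 - plus

# Exact value from the original (non-deterministic) model, Eq. (5).
exact = sum(w * p_plus_A[lam] * p_plus_B[lam] for lam, w in weights.items())

# Monte Carlo over the extended hidden variable (lambda, xi_A, xi_B).
rng = random.Random(1)              # fixed seed so the run is reproducible
n_samples = 200_000
lams = list(weights)
probs = [weights[lam] for lam in lams]
hits = 0
for _ in range(n_samples):
    lam = rng.choices(lams, weights=probs)[0]
    xi_A, xi_B = rng.random(), rng.random()
    hits += chi_det(+1, lam, xi_A, p_plus_A) * chi_det(+1, lam, xi_B, p_plus_B)
estimate = hits / n_samples

print(exact, estimate)
```

The two numbers agree up to the statistical error of the sampling, which is of order 0.001 for this sample size.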

For a fixed value of the hidden variable $\lambda$ the response function
$\chi_A(a, \lambda)$ in a deterministic model takes on the value one for one outcome $a$ and
vanishes for all the others. If we now consider an $n$-partite system, where each of the
parties has the choice of $m$ $v$-valued observables to be measured, any of the $nm$
observables thus divides the hidden variable space $\Lambda$ into $v$ pieces (which do not
necessarily have to be connected). In this way $\Lambda$ is built up of $v^{nm}$ regions
$\Lambda_c$, such that every region is characterized by a single classical configuration
$c$, i.e., an assignment of one of the $v$ outcomes to each of the $nm$ observables (for the
CHSH case for instance with $(n, m, v) = (2, 2, 2)$ there are 16 classical configurations).
This enables us to rewrite Eq. (5) in the form of a sum (and analogously for more than two
sites)

$$ P(a, b \mid A, B) = \sum_{c} \int_{\Lambda_c} M(d\lambda)\, \chi_A(a, \lambda)\, \chi_B(b, \lambda) = \sum_{c} p_c\, \chi_A(a, c)\, \chi_B(b, c) , \tag{11} $$

where $p_c = \int_{\Lambda_c} M(d\lambda)$ is the probability corresponding to the classical
configuration $c$. Locality is here expressed in the fact that the assignment of a value to
an observable at one site does not depend on the observables chosen at other sites.

In Sec. 3 and in the appendix we will see that classical configurations play a crucial role
in the construction of Bell inequalities as the extreme points of the classically accessible 
region. 
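The counting of classical configurations can be made explicit by brute-force enumeration. The following sketch lists all assignments for given (n, m, v) and, for the CHSH case, evaluates the correlation function of Eq. (7) on each of the 16 configurations. The function names and the encoding of outcomes as 0 and 1 are assumptions of this illustration.

```python
from itertools import product

def classical_configurations(n, m, v):
    """All assignments of one of v outcomes to each of the n*m observables."""
    return list(product(range(v), repeat=n * m))

def chsh_value(config):
    """Eq. (7) on a configuration (a1, a2, b1, b2); outcomes 0/1 are mapped to +1/-1."""
    a1, a2, b1, b2 = (1 - 2 * x for x in config)
    return 0.5 * (a1 * (b1 + b2) + a2 * (b1 - b2))

configs = classical_configurations(2, 2, 2)
values = sorted({chsh_value(c) for c in configs})

print(len(configs))   # number of classical configurations in the CHSH case
print(values)         # distinct values of B over all configurations
```

On every configuration the correlation function takes the value +1 or -1, so the bound of the CHSH inequality is attained at extreme points of the classical region and never exceeded.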

2.6. Bell's Telephone 

We will call a Bell telephone any device, which enables Alice to send messages to Bob 
using only the correlations in the particles the two have obtained from a common source. In 
other words, this is precisely the device declared impossible by the no-signalling or locality 
conditions. In this section we will give an alternative proof of the CHSH inequalities, which 
emphasizes this communication aspect: we assume a rather weak "classicality" condition, 
and show that Bell's telephone will work, whenever the CHSH inequality is violated. In 
fact, we show that the quality of the transmission is directly related to the Bell correlation. 

We will discuss this in the framework of correlation experiments used in the previous
sections. The framework contains no space-time aspects, so we cannot say that the
communication would be "superluminal" (for that, see Sec. 5.7). However, since we are free
to move the partners arbitrarily far apart, and the effect has nothing to do with distance,
we can make the communication superluminal if we want.

If we accept the experimental evidence of violations of the inequalities, and also uphold 
the causality principle, which forbids Bell's telephone, then something must be wrong with 
the classicality condition, which is the basic assumption of our proof. What we assume is 
that Bob has a joint measurement device, which simultaneously replaces the two devices 
he uses in the experiment for the Bell correlation. In this way, the violations of Bell's 
inequalities become a direct experimental verification of a well known feature of quantum 
mechanics, namely that there are observables which do not admit a joint measurement. 
It also implies the impossibility of other tasks such as cloning (copying) and teleportation 
(transmission of quantum information on a classical channel). 

Let us state the basic assumption in the given framework: Bob has a joint measuring
device $B_1 \& B_2$ for his two observables, which produces pairs of outcomes $(b_1, b_2)$
with probabilities $p_i(a_i, b_1, b_2) = P(a_i, (b_1, b_2) \mid A_i, B_1 \& B_2)$. The
defining characteristic of such a device is that the statistics of the outcomes is the same
as for the single devices $B_1$ and $B_2$, i.e., for $i = 1, 2$:

$$ \sum_{b_1} p_i(a_i, b_1, b_2) = P(a_i, b_2 \mid A_i, B_2) \quad \text{and} \tag{12} $$

$$ \sum_{b_2} p_i(a_i, b_1, b_2) = P(a_i, b_1 \mid A_i, B_1) . \tag{13} $$

Having this kind of device Bob may guess what apparatus Alice has chosen by simply
interpreting a coincidence of his outcomes $b_1 = b_2$ as "$A_1$" and suspecting "$A_2$"
whenever they differ. If the probability $p_{ok}$ for Bob to be right is better than chance
($p_{ok} > \frac{1}{2}$), then the two can clearly construct a Bell telephone. This,
however, immediately takes place as soon as the CHSH inequality is violated since we can
estimate

$$ \begin{aligned}
p_{ok} &= \frac{1}{2} \sum_{a_1, b_1, b_2} \Bigl| \frac{b_1 + b_2}{2}\, a_1 \Bigr|\, p_1(a_1, b_1, b_2)
        + \frac{1}{2} \sum_{a_2, b_1, b_2} \Bigl| \frac{b_1 - b_2}{2}\, a_2 \Bigr|\, p_2(a_2, b_1, b_2) \\
       &\geq \frac{1}{4} \sum_{a_1, b_1, b_2} (b_1 + b_2)\, a_1\, p_1(a_1, b_1, b_2)
        + \frac{1}{4} \sum_{a_2, b_1, b_2} (b_1 - b_2)\, a_2\, p_2(a_2, b_1, b_2)
        = \frac{\beta}{2} .
\end{aligned} \tag{14} $$



Hence the experimental fact that nature allows $\beta > 1$, together with the no-signalling
assumption, rules out joint measuring devices. But this also forbids the existence of
universal cloning and classical teleportation, since these could be used to construct a
joint measuring device 27.
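The estimate (14) can be followed numerically. The sketch below assumes that Alice picks A1 or A2 with probability 1/2 each and that the joint device is described by two hypothetical distributions p1 and p2 over triples (a, b1, b2). It computes Bob's success probability with the guessing rule from the text, together with the Bell correlation obtained from the marginals (12) and (13). The example distributions are invented for illustration.

```python
def p_ok(p1, p2):
    """Bob guesses A1 if b1 == b2 and A2 otherwise; Alice's choice is uniformly random."""
    right_if_A1 = sum(p for (a, b1, b2), p in p1.items() if b1 == b2)
    right_if_A2 = sum(p for (a, b1, b2), p in p2.items() if b1 != b2)
    return 0.5 * right_if_A1 + 0.5 * right_if_A2

def bell_correlation(p1, p2):
    """beta = (1/2)[E(A1,B1) + E(A1,B2) + E(A2,B1) - E(A2,B2)] via the marginals of p1, p2."""
    s1 = sum((b1 + b2) * a * p for (a, b1, b2), p in p1.items())
    s2 = sum((b1 - b2) * a * p for (a, b1, b2), p in p2.items())
    return 0.5 * (s1 + s2)

# Hypothetical joint-device statistics that violate the CHSH inequality:
# with A1 all three outcomes coincide, with A2 Bob's second outcome is flipped.
p1_violating = {(+1, +1, +1): 0.5, (-1, -1, -1): 0.5}
p2_violating = {(+1, +1, -1): 0.5, (-1, -1, +1): 0.5}

# Hypothetical classical statistics: Bob's two outcomes always coincide.
p1_classical = {(+1, +1, +1): 0.5, (-1, -1, -1): 0.5}
p2_classical = {(+1, +1, +1): 0.5, (-1, -1, -1): 0.5}

for p1, p2 in [(p1_violating, p2_violating), (p1_classical, p2_classical)]:
    print(bell_correlation(p1, p2), p_ok(p1, p2))
```

In the first example the Bell correlation exceeds 1 and Bob identifies Alice's device with certainty; in the second it equals 1 and Bob does no better than chance, consistent with the bound in (14).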

3. All the Bell inequalities 

So far we have only discussed the CHSH inequality as one specific example of a Bell 
inequality. However, there is an infinite hierarchy of such Bell type inequalities, which can 
basically be classified by specifying the type of correlation experiments they deal with. The 
essential assumption leading to any Bell inequality is the existence of a local realistic model, 
which describes the outcomes of a certain class of correlation measurements. The modus 
operandi for the derivation of a class of Bell inequalities would therefore be the following: 
We first fix the type of correlation measurements we want to deal with - say we consider
$n$-partite systems, where each of the parties has the choice of $m$ $v$-valued observables
to be measured.^c Then we consider the space spanned by the entire set of the raw
experimental data, i.e., the $(mv)^n$ probabilities, and ask for the inequalities which
bound the region that is accessible within the framework of a local realistic model.
Whatever this underlying model looks like, if only it is "classical", i.e. a local hidden
variable model, the accessible region will be contained in a convex polytope, whose extreme
points are the classical configurations (for the connection to convex geometry see the
appendix). The classical region is thus bounded by a finite albeit huge number of linear
inequalities. These are the natural generalizations of the original inequality John Bell
published in 1964 5.

^c Of course we are free to require further restrictions, e.g. we might just want to look
at a subset of all possible correlations or restrict to a special class of observables or
systems we want to investigate.

Hence we are faced with a whole hierarchy of inequalities. The task of finding a minimal
set of these inequalities, which is complete in the sense that they are satisfied if and only
if the correlations considered allow a local classical model, is however closely related to
some known hard problems in computational complexity 28,29. So it is not surprising
that complete solutions only exist either in cases where additional symmetries can be
exploited 14,30, or for small values of $(n, m, v)$ where we can utilize today's computing
power for a brute force approach. An extensive numerical search for the cases
$(n, m, v) = (3, 2, 2)$ and $(2, 3, 2)$ was performed by Pitowsky and Svozil 31.
Unfortunately, however, the result of such a search is typically a list of the coefficients
of thousands of inequalities 32, from which generalizable insights cannot easily be
extracted.

Various aspects of the hierarchy of Bell inequalities have been investigated. Garg
and Mermin 19 for instance have taken up the idea of Bell and discussed systems with
maximal (anti-)correlation and $v > 2$. Gisin 33 investigated setups with more than two
dichotomic observables per site (however for arbitrary states). Another closely related
subject of interest, which we will however not discuss, is to expose the non-local (or
non-classical) character of nature without making use of Bell-type inequalities (cf.
Refs. 34, 35). In the following we will again briefly discuss the CHSH 26 inequalities as
the first complete set of inequalities for the case $(n, m, v) = (2, 2, 2)$. Then we
recapitulate the complete characterization for the multipartite case $(n, 2, 2)$ 14,30,
where additional symmetries enable us to give a rather exhaustive discussion.

3.1. The CHSH case as a complete set of inequalities

The CHSH inequalities are by far the best studied case of Bell inequalities. In this
case, the characterization is complete in the following sense, first established by Arthur
Fine 25.

The following conditions on a correlation table for two parties with two dichotomic
observables each ($(n, m, v) = (2, 2, 2)$) are equivalent:

1. The correlation table allows a local realistic model in the sense of Eq. (5).

2. The CHSH inequalities hold, i.e., Eq. (9) holds, also when any observables at one
site or the outcomes of any observable are interchanged.

3. There exists a joint probability distribution for the outcomes of the four observables,
which returns the measured correlation probabilities as marginals.




We have shown above that 1 and 3 are equivalent, and imply 2. Hence the non-trivial
"completeness" part is to show that 2 implies the others, which can be done by analyzing
the polytope discussed above.^d
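Condition 2 refers to a whole family of inequalities generated from Eq. (9). Interchanging the observables at one site moves the minus sign to another term, and interchanging the outcomes of one observable flips an overall sign, which gives eight inequalities in total. The sketch below checks them for a given set of four expectation values; the dictionary encoding and the example values are assumptions of this illustration, and the code tests condition 2 only, not the existence of a model.

```python
from itertools import product

def chsh_expressions(E):
    """E maps (i, j) with i, j in {1, 2} to the expectation value E(A_i, B_j).

    Returns the left-hand sides of the eight CHSH inequalities: the minus sign
    sits at one of four positions, and each expression appears with both signs.
    """
    values = []
    for minus_at in product((1, 2), repeat=2):
        total = 0.5 * sum((-1 if key == minus_at else 1) * E[key] for key in E)
        values.extend([total, -total])
    return values

def satisfies_chsh(E, tol=1e-12):
    """True if all eight inequalities of the form (9) hold."""
    return max(chsh_expressions(E)) <= 1.0 + tol

# Hypothetical expectation values: perfect correlation in every setup.
E_correlated = {(1, 1): 1.0, (1, 2): 1.0, (2, 1): 1.0, (2, 2): 1.0}

# Hypothetical expectation values chosen so that Eq. (9) itself fails.
E_violating = {(1, 1): 0.7, (1, 2): 0.7, (2, 1): 0.7, (2, 2): -0.7}

print(satisfies_chsh(E_correlated), satisfies_chsh(E_violating))
```

The perfectly correlated values saturate some of the eight inequalities without violating any, while the second set gives 1.4 on the left-hand side of Eq. (9).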

Item 3 points to another interesting generalization of the CHSH inequalities. Given
a set of probability distributions $P_1, \ldots, P_k$, there always exists a joint
distribution that returns them as marginals, namely the product distribution
$\prod_{i=1}^{k} P_i$. However, if we fix in addition the joint distributions $P_{ij}$ for a
certain subset of pairs $(i, j)$, best visualized as a graph, a joint distribution with these
marginals in general no longer exists. Bell inequalities thus appear as the obstructions to
extending partial joint probability distributions.

3.2. All multipartite correlation Bell inequalities for two dichotomic observables per site

$n$-particle generalizations of the CHSH inequality were first proposed by Mermin 37,
and further developed by Ardehali 38, Belinskii and Klyshko 40 and others 39,41. In these
works the emphasis was to find just one inequality for every $n$. In this section we give a
complete set, first constructed in Refs. 14,30.

The data under consideration are the $2^n$ full correlation functions of an arbitrary
$n$-partite system, with two dichotomic observables per site. Each of the $2^n$ different
experimental setups is labeled by the choice of observables at each site. We will
parameterize these choices by binary variables $s_k \in \{0, 1\}$ so that $s_k$ indicates
the choice of the ±1 valued observable $A_k(s_k)$ at site $k$. A "full correlation function"
is then the expectation of a product

$$ \xi(s) = E\Bigl( \prod_{k} A_k(s_k) \Bigr) , \tag{15} $$

where the bit string $s = (s_1, \ldots, s_n)$ labels the respective experimental setup.
Hence $\xi(s)$ can be considered as a component of a vector $\xi$ in the $2^n$ dimensional
space spanned by the experimental data, and any Bell inequality is therefore of the form

$$ \sum_{s} \beta(s)\, \xi(s) \leq 1 , \tag{16} $$

where we have normalized the coefficients $\beta(s)$ such that the maximal classical value
is 1 (i.e., for the CHSH case $\beta = (\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, -\frac{1}{2})$).
The linear combination in Eq. (16) may also be computed under the expectation value, so
that this inequality can be stated as an upper bound to the expectation of

$$ B = \sum_{s} \beta(s) \prod_{k=1}^{n} A_k(s_k) . \tag{17} $$

We will call such expressions Bell polynomials. They can be used directly in the quantum
case as Bell operators, where all variables $A_k(s_k)$ are substituted by operators with
$-1 \le A_k(s_k) \le 1$, acting in the Hilbert space of the k-th site, and the product is taken as the
tensor product.

d Fine originally conjectured that the CHSH inequalities (written out for all possible choices of observables)
might even be a complete set for m > 2. However, Garg and Mermin 36 provided a counterexample
with (n,m,v) = (2,3,2).

For the construction of a complete set of inequalities it suffices to consider the extremal
cases, i.e., classical configurations, where each of the observables takes on one of its two
values with certainty. The restriction to full correlation functions, i.e., disregarding correlations
of less than n sites, then enables us to exploit the invariance of $\xi$ under swapping
the values of both observables on two sites. It is basically this symmetry which leads to the
fact that any of the $2^{2^n}$ binary vectors $f \in \{-1,1\}^{2^n}$ with components $f(r)$, $r \in \{0,1\}^n$,
corresponds to one Bell inequality via

$$\beta(s) = 2^{-n} \sum_{r} f(r)\, (-1)^{\langle r, s\rangle} , \qquad (18)$$

where $\langle r, s\rangle = \sum_{k=1}^{n} r_k s_k$ denotes the inner product. Moreover, Eq. (18) indeed provides a
set which is complete in the sense that the considered correlations allow a local classical
model if and only if all these inequalities are satisfied.
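As a small illustration of Eqs. (16) and (18), the following Python sketch enumerates all $2^{2^n}$ sign vectors f for n = 2, builds the coefficients $\beta(s)$, and checks by brute force over deterministic $\pm 1$ assignments that each resulting inequality has classical maximum 1. The function names and the dictionary layout are choices made for this example only, not notation from the cited references.

```python
from itertools import product

def bell_coefficients(f, n):
    """Eq. (18): beta(s) = 2^-n * sum_r f(r) (-1)^<r,s>; f maps bit tuples r to +-1."""
    settings = list(product((0, 1), repeat=n))
    return {
        s: sum(f[r] * (-1) ** sum(rk * sk for rk, sk in zip(r, s)) for r in settings) / 2 ** n
        for s in settings
    }

def classical_maximum(beta, n):
    """Largest value of sum_s beta(s) prod_k a_k(s_k) over deterministic +-1 assignments."""
    best = float("-inf")
    for values in product((-1, 1), repeat=2 * n):   # values[2k + s_k] plays the role of a_k(s_k)
        total = 0.0
        for s, b in beta.items():
            term = b
            for k, sk in enumerate(s):
                term *= values[2 * k + sk]
            total += term
        best = max(best, total)
    return best

n = 2
settings = list(product((0, 1), repeat=n))
all_f = [dict(zip(settings, signs)) for signs in product((-1, 1), repeat=2 ** n)]
maxima = [classical_maximum(bell_coefficients(f, n), n) for f in all_f]

# f = (+1, +1, +1, -1) reproduces the CHSH coefficients (1/2, 1/2, 1/2, -1/2).
chsh = bell_coefficients(dict(zip(settings, (1, 1, 1, -1))), n)
print(len(all_f), set(maxima), chsh)
```

Some of the 16 vectors f give trivial inequalities such as $\xi(0,0) \le 1$; the brute-force check treats them on the same footing as the CHSH-type ones.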

Surprisingly, these $2^{2^n}$ linear inequalities are equivalent to a single non-linear inequality,^e
namely

$$\sum_{r} |\hat\xi(r)| \le 1 \quad\text{with}\quad \hat\xi(r) = 2^{-n} \sum_{s} \xi(s)\, (-1)^{\langle r, s\rangle} . \qquad (19)$$

This is the characterizing inequality of a hyperoctahedron in $2^n$ dimensions. Hence the
classically accessible region in this case has surprisingly high symmetry 14, which is unfortunately
not a symmetry inherent to the underlying problem. One should thus not expect
to find an analogous structure for other cases of (n, m, v).
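The equivalence between the family (18) and the single inequality (19) can be checked numerically for n = 2. The sketch below, with helper names invented for this example, compares the left hand side of Eq. (19) with the largest value of $\sum_s \beta(s)\xi(s)$ over all sign vectors f. As test data it uses the correlation vector $\xi = (1,1,1,-1)/\sqrt2$, which is the CHSH-optimal quantum point, and a deterministic classical vector.

```python
import math
from itertools import product

def dot(r, s):
    return sum(a * b for a, b in zip(r, s))

def transformed(xi, n):
    """xi_hat(r) = 2^-n sum_s xi(s) (-1)^<r,s>, as in Eq. (19)."""
    settings = list(product((0, 1), repeat=n))
    return {r: sum(xi[s] * (-1) ** dot(r, s) for s in settings) / 2 ** n for r in settings}

def nonlinear_value(xi, n):
    """Left hand side of the single non-linear inequality (19)."""
    return sum(abs(x) for x in transformed(xi, n).values())

def best_linear_value(xi, n):
    """Maximum over all sign vectors f of sum_s beta_f(s) xi(s), with beta_f from Eq. (18)."""
    settings = list(product((0, 1), repeat=n))
    best = float("-inf")
    for signs in product((-1, 1), repeat=2 ** n):
        f = dict(zip(settings, signs))
        value = 0.0
        for s in settings:
            beta_s = sum(f[r] * (-1) ** dot(r, s) for r in settings) / 2 ** n
            value += beta_s * xi[s]
        best = max(best, value)
    return best

settings = list(product((0, 1), repeat=2))
c = 1 / math.sqrt(2)
quantum = dict(zip(settings, (c, c, c, -c)))     # CHSH-optimal quantum correlations
classical = dict(zip(settings, (1, 1, 1, 1)))    # a deterministic classical configuration
print(nonlinear_value(quantum, 2), best_linear_value(quantum, 2))
print(nonlinear_value(classical, 2), best_linear_value(classical, 2))
```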

4. Quantum states with no violation 

The violation of Bell's inequality was the first mathematically sharp criterion for en- 
tanglement. In this section we describe cases in which this criterion fails to detect any 
entanglement, even though in some of these cases the quantum state may be "entangled" 
according to now current terminology. 

4.1. Separable states 

Generally, entanglement is defined in terms of its negation, and a quantum state is 
said to be unentangled, separable or classically correlated iff it can be written as a convex 
combination of product states. 

e The possibility of replacing the set of linear inequalities by a single nonlinear one was apparently first
recognized by Cirelson for the CHSH case in Ref. 63.

Let $\rho$ be a density matrix corresponding to a composite quantum system described on
the Hilbert space $\mathcal{H} = \mathcal{H}^{(1)} \otimes \mathcal{H}^{(2)}$. Then by definition a separable state can be written as

$$\rho = \sum_{j} p_j\, \rho_j^{(1)} \otimes \rho_j^{(2)} , \qquad (20)$$

where the positive weights $p_j$ sum up to one and $\rho_j^{(i)}$ describes a state on $\mathcal{H}^{(i)}$. The
terminology "classically correlated" is justified by the fact that the preparation leading
to the correlations can be assumed to be classical in the following sense. Suppose we have
two independent preparing devices, one for each subsystem, which prepare a certain state
$\rho_j^{(i)}$ depending on some classical input j. Then in order to obtain a state of the form (20)
we just have to add a random generator, which produces numbers j with probability $p_j$.
Combining these three devices thus leads to such a "separable" state, and the expectation
value of two observables $A^{(1)}, A^{(2)}$ is then given by

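$$\operatorname{tr}\big(\rho\, A^{(1)} \otimes A^{(2)}\big) = \sum_{j} p_j\, \operatorname{tr}\big(\rho_j^{(1)} A^{(1)}\big)\, \operatorname{tr}\big(\rho_j^{(2)} A^{(2)}\big) . \qquad (21)$$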

The correlations thus depend only on the random generator, which however can be chosen
to be a purely classical device.^f Moreover, it is obvious from Eq. (21) that any separable
state admits a description within the framework of a local classical model (as described in
Sec. 2.4) and therefore satisfies any Bell inequality. The following subsection is concerned
with the fact that the converse, however, is not true.

4.2. $U \otimes U$ invariant states

The key idea in Ref. 8 for circumventing, at least to some extent, both difficulties,
the construction of a local classical model and the proof of non-separability, is to make
extensive use of symmetries. The states considered are those commuting with all unitaries
of the form $U \otimes U$ and can be written as

$$\rho(p) = (1-p)\,\frac{P_+}{r_+} + p\,\frac{P_-}{r_-} , \qquad 0 \le p \le 1 , \qquad (22)$$

where $P_+$ ($P_-$) is the projector onto the symmetric (antisymmetric) subspace of $\mathbb{C}^d \otimes \mathbb{C}^d$
and $r_\pm = \operatorname{tr} P_\pm = (d^2 \pm d)/2$ are the respective dimensions. It has been shown that states of
the form (22) are separable iff $p \le \tfrac12$, independent of the dimension of the system.

Now suppose a von Neumann measurement is performed on each of the two subsystems
(i = 1,2):

$$A^{(i)} = \sum_{\mu} a_\mu^{(i)} Q_\mu^{(i)} \quad\text{with}\quad \sum_{\mu} Q_\mu^{(i)} = \mathbb{1} , \qquad (23)$$

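where the $Q_\mu^{(i)}$ are one-dimensional orthogonal projectors. A description within a local
classical model then would require that there exist a measure M on a probability space
$\Lambda \ni \lambda$ and response functions $\chi^{(i)}(\mu,\lambda) \ge 0$ (with $\sum_\mu \chi^{(i)}(\mu,\lambda) = 1$) for any observable,
such that

$$\operatorname{tr}\big(\rho\, Q_\mu^{(1)} \otimes Q_\nu^{(2)}\big) = \int M(d\lambda)\, \chi^{(1)}(\mu,\lambda)\, \chi^{(2)}(\nu,\lambda) . \qquad (24)$$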

f We note that classical correlation does not mean that the state has actually been prepared in the manner
described, but only that its statistical properties can be reproduced by a classical mechanism.

For $\Lambda$ being the unit sphere $\{\lambda \in \mathbb{C}^d \mid \|\lambda\| = 1\}$ and the choice

$$\chi^{(1)}(\mu,\lambda) = \langle \lambda, Q_\mu^{(1)} \lambda \rangle ,$$

$$\chi^{(2)}(\nu,\lambda) = \begin{cases} 1, & \langle \lambda, Q_\nu^{(2)} \lambda \rangle < \langle \lambda, Q_\mu^{(2)} \lambda \rangle \ \ \forall \mu \ne \nu \\ 0, & \text{else} \end{cases}$$

it can be shown that Eq. (24) indeed holds for

$$p = 1 - \frac{d+1}{2d^2} , \qquad (25)$$

which is in any nontrivial case larger than one half and thus corresponds to an entangled
state. For increasing dimension Eq. (25) even approaches 1, which is (within the family
of $U \otimes U$ invariant states) as far removed from the classically correlated state ($p = \tfrac12$) as
possible.
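To make the last two statements concrete, the short sketch below evaluates the threshold $p = 1 - (d+1)/(2d^2)$ for a few dimensions and compares it with the separability boundary $p = 1/2$. It is plain arithmetic under the assumption that Eq. (25) has exactly this form; the function name is hypothetical.

```python
from fractions import Fraction

def lhv_threshold(d):
    """Value of p in Eq. (25), assumed to read p = 1 - (d + 1) / (2 d^2)."""
    return 1 - Fraction(d + 1, 2 * d * d)

for d in (2, 3, 10, 100):
    p = lhv_threshold(d)
    # Each row: dimension, exact p, decimal p, and whether p exceeds the separable range p <= 1/2.
    print(d, p, float(p), p > Fraction(1, 2))
```

For d = 2 this gives p = 5/8, which corresponds to a mixture of the singlet projector and the maximally mixed state with equal weights.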

4.3. PPT states 

Another class of states for which it has been shown 13,14 that none of the inequalities
discussed in Sec. 3 are violated is the class of states having a positive partial transpose
(PPT states). Peres 42 even conjectured that these states in general admit a description
within a local classical model. Note that without additional assumptions the converse,
however, is not true, since the state (25) discussed in the previous subsection admits such
a local classical description although it is not a PPT state.

The partial transpose $A^{T_1}$ of an operator A on $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ is defined in terms of its
matrix elements with respect to a given basis by $\langle k\ell|A^{T_1}|mn\rangle = \langle m\ell|A|kn\rangle$, and $\rho$ is said
to be a PPT state if $\rho^{T_1} \ge 0$.
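The definition can be tried out on the $U \otimes U$ invariant states of Eq. (22). The NumPy sketch below builds $\rho(p)$ from the flip operator, applies the partial transpose by reshaping to a four-index tensor, and tests positivity. The helper names and the numerical tolerance are assumptions made for this example.

```python
import numpy as np

def werner_state(p, d):
    """rho(p) = (1 - p) P_+ / r_+ + p P_- / r_- on C^d (x) C^d, following Eq. (22)."""
    swap = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            swap[i * d + j, j * d + i] = 1.0          # flip operator |i j> -> |j i>
    identity = np.eye(d * d)
    p_plus, p_minus = (identity + swap) / 2, (identity - swap) / 2
    return (1 - p) * p_plus / np.trace(p_plus) + p * p_minus / np.trace(p_minus)

def partial_transpose(rho, d):
    """Transpose the first tensor factor: <k l|A^T1|m n> = <m l|A|k n>."""
    t = rho.reshape(d, d, d, d)                        # indices (k, l, m, n)
    return t.transpose(2, 1, 0, 3).reshape(d * d, d * d)

def is_ppt(rho, d, tol=1e-12):
    return bool(np.linalg.eigvalsh(partial_transpose(rho, d)).min() >= -tol)

print(is_ppt(werner_state(0.4, 3), 3))    # inside the separable range p <= 1/2
print(is_ppt(werner_state(5 / 8, 2), 2))  # the d = 2 state with a local classical model
```

For this family the smallest eigenvalue of the partial transpose works out to $(1-2p)/d$, so the PPT property coincides with $p \le 1/2$.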

The key idea for showing that PPT states satisfy any inequality coming from Eq.(18) 
is to utilize the positivity of the partial transpose by applying the variance inequality 

$$[\operatorname{tr}(\rho B)]^2 = [\operatorname{tr}(\rho^{T_1} B^{T_1})]^2 \le \operatorname{tr}\big(\rho^{T_1} (B^{T_1})^2\big) , \qquad (26)$$

where B is the Bell operator, whose expectation value has to be bounded by one within
a local classical model. Additionally averaging over all partial transposes with respect
to any partition of the multipartite system into two subsystems then shows that in fact
$\operatorname{tr}(\rho B) \le 1$.

Though the positivity of the partial transpose is known to be one of the most efficient 
separability criteria 43 it is in general not a sufficient one. Hence there exist states, which 
are not classically correlated although they satisfy the PPT condition and therefore ad- 
mit a (however possibly restricted) local classical description 12 . These states are often 
referred to as PPT bound entangled states since they have the additional property that 
their entanglement cannot be recovered by entanglement distillation 44 . 

5. Quantum violations: The CHSH case 

5.1. The See-Saw iteration 

The derivation of the maximal quantum violation of a Bell inequality for an arbitrary
state is a high dimensional variational problem for which we are not aware of any explicit
solution except for the case of the CHSH inequality for two qubit systems. Therefore we
begin by providing a simple iterative algorithm, the See-Saw iteration, which turned out
to be an efficient method for maximizing an affine functional with respect to Hermitian
operators with $-1 \le A \le 1$, corresponding to the expectation of a Bell operator with
dichotomic observables. Since generalization to the multipartite case is straightforward,
we content ourselves with bipartite systems where the functional to be maximized is of
the form

$$\operatorname{tr}(B\rho) = \sum_{i,j} \beta_{ij}\, \operatorname{tr}\big( (A_i \otimes B_j)\, \rho \big) , \qquad (27)$$

where it suffices to consider unitary observables $A = A^* = A^{-1}$ (and the same for B), as
these are extremal in the convex set of Hermitian operators with $-1 \le A \le 1$. The idea
is now to maximize this functional with respect to the observables on one site while keeping
the other ones fixed, and then to iterate this procedure. Therefore, we rewrite Eq. (27) by
taking the partial trace over that site of the system which we keep fixed, i.e.:

$$\operatorname{tr}(B\rho) = \sum_{i} \operatorname{tr}_A (X_i A_i) = \sum_{j} \operatorname{tr}_B (Y_j B_j) \quad\text{with} \qquad (28)$$

$$X_i = \operatorname{tr}_B \Big( \sum_{j} \beta_{ij}\, (\mathbb{1} \otimes B_j)\, \rho \Big) , \qquad (29)$$

$$Y_j = \operatorname{tr}_A \Big( \sum_{i} \beta_{ij}\, (A_i \otimes \mathbb{1})\, \rho \Big) . \qquad (30)$$

The maximization in Eq. (28), however, can be made explicit just by taking $A_i = \operatorname{sign}(X_i)$
resp. $B_j = \operatorname{sign}(Y_j)$. Of course, the See-Saw iteration faces the usual problem of
most numerical optimization methods: it cannot guarantee convergence to the absolute
extremum. Nevertheless it turned out to be a useful tool in the search for Bell violations
(e.g. in Ref. ), which typically converges after a few iterations and is in general very stable
with respect to variations of the initial values.
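The iteration described by Eqs. (27) to (30) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the cited work: the random starting observables, the fixed number of sweeps, the convention sign(0) = +1 and all function names are assumptions of this example.

```python
import numpy as np

def operator_sign(x):
    """sign(X) for Hermitian X via its eigendecomposition (eigenvalue 0 is sent to +1)."""
    w, v = np.linalg.eigh((x + x.conj().T) / 2)
    return (v * np.where(w >= 0, 1.0, -1.0)) @ v.conj().T

def partial_trace(m, d_a, d_b, over):
    """Trace out site 'A' or 'B' of an operator on C^d_a (x) C^d_b."""
    t = m.reshape(d_a, d_b, d_a, d_b)
    return np.einsum('ibjb->ij', t) if over == 'B' else np.einsum('aiaj->ij', t)

def random_observable(d, rng):
    """Random Hermitian unitary with alternating +-1 spectrum (hypothetical starting point)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(g)
    signs = np.array([(-1.0) ** k for k in range(d)])
    return (q * signs) @ q.conj().T

def see_saw(rho, beta, d_a, d_b, sweeps=20, seed=0):
    rng = np.random.default_rng(seed)
    n_a, n_b = beta.shape
    obs_b = [random_observable(d_b, rng) for _ in range(n_b)]
    id_a, id_b = np.eye(d_a), np.eye(d_b)
    for _ in range(sweeps):
        # A_i = sign(X_i) with X_i from Eq. (29), Bob's observables held fixed
        obs_a = [operator_sign(partial_trace(
                     sum(beta[i, j] * np.kron(id_a, obs_b[j]) for j in range(n_b)) @ rho,
                     d_a, d_b, over='B')) for i in range(n_a)]
        # B_j = sign(Y_j) with Y_j from Eq. (30), Alice's observables held fixed
        obs_b = [operator_sign(partial_trace(
                     sum(beta[i, j] * np.kron(obs_a[i], id_b) for i in range(n_a)) @ rho,
                     d_a, d_b, over='A')) for j in range(n_b)]
    value = sum(beta[i, j] * np.trace(np.kron(obs_a[i], obs_b[j]) @ rho)
                for i in range(n_a) for j in range(n_b)).real
    return value, obs_a, obs_b

psi = np.array([0, 1, -1, 0]) / np.sqrt(2)          # two-qubit singlet
rho_singlet = np.outer(psi, psi.conj())
beta_chsh = 0.5 * np.array([[1, 1], [1, -1]])       # CHSH normalized to classical maximum 1
value, _, _ = see_saw(rho_singlet, beta_chsh, 2, 2)
print(value)
```

For the singlet the first sweep already makes Alice's two directions orthogonal, so the value settles at $\sqrt2$ after very few sweeps unless the two starting observables on Bob's side happen to coincide up to sign.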

5.2. Cirelson's inequality 

The best upper bound for the violation of the CHSH inequality, first derived by
Cirelson 45, is obtained by squaring the Bell operator and utilizing the variance inequality 46,
which already appeared in the previous section in Eq. (26). Taking again into account that
it suffices to consider unitary observables of the form $A = A^* = A^{-1}$ we get

$$B^2 = \mathbb{1} - \tfrac14\, [A_1, A_2] \otimes [B_1, B_2] . \qquad (31)$$

Since the commutators are bounded by two, as $\|[A,B]\| \le 2\|A\|\,\|B\|$, this leads to

$$\operatorname{tr}(\rho B) \le \sqrt2 , \qquad (32)$$

which is usually referred to as Cirelson's inequality.
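Eq. (31) and the bound (32) are easy to confirm numerically for a concrete choice of qubit observables. The sketch below assumes the CHSH operator in the form $B = \tfrac12[A_1 \otimes (B_1 + B_2) + A_2 \otimes (B_1 - B_2)]$, which matches the normalization $\beta = (\tfrac12, \tfrac12, \tfrac12, -\tfrac12)$; the particular Pauli combinations are one standard choice and not taken from the text.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

def commutator(x, y):
    return x @ y - y @ x

def chsh_operator(a1, a2, b1, b2):
    """CHSH Bell operator normalized so that the classical bound is 1."""
    return 0.5 * (np.kron(a1, b1 + b2) + np.kron(a2, b1 - b2))

a1, a2 = sx, sy
b1, b2 = (sx + sy) / np.sqrt(2), (sx - sy) / np.sqrt(2)   # Hermitian unitaries as well

bell = chsh_operator(a1, a2, b1, b2)
lhs = bell @ bell
rhs = np.eye(4) - 0.25 * np.kron(commutator(a1, a2), commutator(b1, b2))   # Eq. (31)
largest = np.linalg.eigvalsh(bell).max()
print(np.allclose(lhs, rhs), largest)
```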

5.3. Operators for maximal violation

The bound $\|[A,B]\| \le 2\|A\|\,\|B\|$ used in the previous section is clearly saturated when
A and B are Pauli matrices. It is therefore no surprise that experiments coming close to
the maximal violation $\beta \approx \sqrt2$ are possible with qubit systems, and indeed this is precisely
the idealized description of Aspect's experiments. In this subsection we argue (following
Ref. 47) that the qubit example is even the only possibility to get the maximal violation in
any dimension.

To give a more precise, but simple statement, suppose that both the restricted density
operators are faithful (have no zero eigenvalues), and suppose $A_1, A_2, B_1, B_2$ are operators
giving $\beta = \sqrt2$. Then $A_1, A_2$ and $A_3 = iA_2A_1$ satisfy all the algebraic relations of the
Pauli matrices: $A_1^2 = \mathbb{1}$, $A_1 A_2 = iA_3$, and cyclic permutations thereof.

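The basic idea of the proof is to write the inequalities in terms of the two operators

$$A = \tfrac12\, (A_1 + iA_2) , \qquad (33)$$

$$B = \tfrac{1}{2\sqrt2}\, \big[ (B_1 + B_2) + i(B_1 - B_2) \big] , \qquad (34)$$

which allow a simple representation of the Bell operator (written $\mathbf{B}$ here to distinguish it
from B in Eq. (34)) as $\mathbf{B} = \sqrt2\,(A^*B + AB^*)$. One
readily checks that, on the other hand, $A^*A + AA^* \le \mathbb{1}$ and $B^*B + BB^* \le \mathbb{1}$. The core
of the proof is the decomposition

$$\mathbb{1} - \mathbf{B}/\sqrt2 = \tfrac12 \big[ (A-B)^*(A-B) + (A-B)(A-B)^* \big]
  + \tfrac14 \big[ (\mathbb{1} - A_1^2) + (\mathbb{1} - A_2^2) + (\mathbb{1} - B_1^2) + (\mathbb{1} - B_2^2) \big] . \qquad (35)$$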

Clearly, the right hand side is a positive operator, which provides an alternative proof of
Cirelson's inequality (32). States with maximal violation are those for which this expression,
and hence every single term on the right hand side, has expectation zero. In particular,
for any vector $\Phi$ in the support of the density operator we get $(A - B)\Phi = (A^* - B^*)\Phi =
(\mathbb{1} - A_1^2)\Phi = \cdots = 0$. Hence $(A^2 + A^{*2})\Phi = \tfrac12 (A_1^2 - A_2^2)\Phi = 0$ and $(B^2 - B^{*2})\Phi =
\tfrac{i}{2}(B_1^2 - B_2^2)\Phi = 0$. Combining this with $A^2\Phi = B^2\Phi$ and $A^{*2}\Phi = B^{*2}\Phi$ we get $A^2\Phi = 0$
and similarly $(A^*A + AA^*)\Phi = \Phi$ for every vector $\Phi$ in the support of the density operator.
Since the reduced density operator for Alice has full support, this implies the identities
$A^*A + AA^* = \mathbb{1}$ and $A^2 = 0$, so that $A_1 = A + A^*$, $A_2 = i(A^* - A)$, and $A_3 = A^*A - AA^*$
are a realization of the Pauli matrices.
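The operator identity behind this argument can be checked numerically for randomly chosen Hermitian contractions. The sketch assumes the normalizations $A = \tfrac12(A_1 + iA_2)$ and $B = \tfrac{1}{2\sqrt2}[(B_1+B_2) + i(B_1-B_2)]$, for which the decomposition reads $\mathbb{1} - \mathbf{B}/\sqrt2 = \tfrac12[(A-B)^*(A-B) + (A-B)(A-B)^*] + \tfrac14\sum(\mathbb{1} - X^2)$ with X running over $A_1, A_2, B_1, B_2$. Dimension, seed and helper names are arbitrary choices for this example.

```python
import numpy as np

def random_contraction(d, rng):
    """Random Hermitian matrix rescaled to spectral norm 1, so that -1 <= h <= 1."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    h = g + g.conj().T
    return h / np.linalg.norm(h, 2)

def dagger(x):
    return x.conj().T

rng = np.random.default_rng(1)
d = 3
local = [random_contraction(d, rng) for _ in range(4)]
eye = np.eye(d)
A1, A2 = np.kron(local[0], eye), np.kron(local[1], eye)   # Alice's observables, embedded
B1, B2 = np.kron(eye, local[2]), np.kron(eye, local[3])   # Bob's observables, embedded
one = np.eye(d * d)

A = (A1 + 1j * A2) / 2
B = ((B1 + B2) + 1j * (B1 - B2)) / (2 * np.sqrt(2))
bell = 0.5 * (A1 @ (B1 + B2) + A2 @ (B1 - B2))            # CHSH Bell operator

D = A - B
rhs = 0.5 * (dagger(D) @ D + D @ dagger(D)) \
      + 0.25 * sum(one - X @ X for X in (A1, A2, B1, B2))
lhs = one - bell / np.sqrt(2)
identity_holds = np.allclose(lhs, rhs)
largest = np.linalg.eigvalsh(bell).max()
print(identity_holds, largest)
```

Since every term on the right hand side is positive, the largest eigenvalue of the Bell operator stays below $\sqrt2$ for any such choice of observables.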

5.4. Qubits: Structure of the Bell operator 

It is obvious from Eq. (31) that as soon as the observables on only one of the two subsystems
commute, the inequality is satisfied. We may therefore disregard the case $A = \pm\mathbb{1}$ and
for the case of two qubits restrict to observables of the form $A_k(s_k) = a_k(s_k) \cdot \sigma$, where $\sigma$ is
the vector of Pauli matrices and $a_k(s_k)$ is a normalized vector in $\mathbb{R}^3$. Furthermore, we can
use the homomorphism between SU(2) and SO(3) and do a local unitary transformation
such that the vectors belonging to the four observables all lie in the x-y plane, i.e.:

$$A_k = \sigma_1 \sin\alpha_k + \sigma_2 \cos\alpha_k \qquad (36)$$

(and analogously $A_k'$ with angle $\alpha_k'$). With this choice of observables we get

$$B^2 = \mathbb{1} - \sin(\alpha_1 - \alpha_1') \sin(\alpha_2 - \alpha_2')\, \sigma_3 \otimes \sigma_3 , \qquad (37)$$

such that the four Bell states $|\Phi^\pm\rangle = \tfrac{1}{\sqrt2}(|00\rangle \pm |11\rangle)$ and $|\Psi^\pm\rangle = \tfrac{1}{\sqrt2}(|01\rangle \pm |10\rangle)$ turn
out to be eigenstates of the CHSH operator. Moreover, B has symmetric spectrum since
$(\mathbb{1} \otimes \sigma_3) B = -B (\mathbb{1} \otimes \sigma_3)$, so that we end up with a Bell operator which is up to local unitary
transformations always of the form 49

$$B = \lambda_1 (P_{\Phi^+} - P_{\Phi^-}) + \lambda_2 (P_{\Psi^+} - P_{\Psi^-}) , \qquad (38)$$

where P denotes the projector onto the respective Bell state and the eigenvalues have to
satisfy $\lambda_1^2 + \lambda_2^2 = 2$, which follows from $\operatorname{tr} B^2 = 4$.

In fact, the spectral decomposition in Eq.(38) exhibits a property of the Bell operator, 
which is typical for any multipartite inequality for two dichotomic observables per site (see 
Sec.6.2.). 

5.5. Qubits: Maximal violation for arbitrary states 

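Let us now consider an arbitrary quantum state $\rho$ of two qubits and let $R_{ij} = \operatorname{tr}(\rho\, \sigma_i \otimes \sigma_j)$.
Following Ref. 50 the maximal violation of the CHSH inequality is then given by

$$\beta(\rho) = \max_{a_1, a_1', a_2, a_2'} \tfrac12 \big( a_1 \cdot R(a_2 + a_2') + a_1' \cdot R(a_2 - a_2') \big)$$
$$\phantom{\beta(\rho)} = \max_{a_2, a_2'} \tfrac12 \big( \|R(a_2 + a_2')\| + \|R(a_2 - a_2')\| \big)$$
$$\phantom{\beta(\rho)} = \max_{c \perp c',\, \varphi} \big( \cos\varphi\, \|Rc\| + \sin\varphi\, \|Rc'\| \big)$$
$$\phantom{\beta(\rho)} = \max_{c \perp c'} \sqrt{ \|Rc\|^2 + \|Rc'\|^2 } , \qquad (39)$$

where the maxima are always taken over all unit vectors. Evaluating the last maximum
we obtain

$$\beta(\rho) = \sqrt{\nu + \nu'} , \qquad (40)$$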

where $\nu, \nu'$ are the two largest eigenvalues of the matrix $R^T R$. For pure two qubit states,
which can always be written in their Schmidt form as $|\Psi\rangle = \cos\varphi\,|00\rangle + \sin\varphi\,|11\rangle$, this can
further be evaluated to

$$\beta(\Psi) = \sqrt{1 + \sin^2(2\varphi)} , \qquad (41)$$

which means that as soon as a pure two qubit state is entangled it violates the CHSH
inequality 51.^g
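Eq. (40) translates directly into a few lines of NumPy: build the correlation matrix R from the Pauli expectation values and add the two largest eigenvalues of $R^T R$. The sketch below, with function names chosen for this example, also compares the result with Eq. (41) for a state in Schmidt form.

```python
import numpy as np

PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]]),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def max_chsh(rho):
    """beta(rho) = sqrt(nu + nu') from Eq. (40), with R_ij = tr(rho sigma_i (x) sigma_j)."""
    R = np.array([[np.trace(rho @ np.kron(si, sj)).real for sj in PAULIS] for si in PAULIS])
    nu = np.sort(np.linalg.eigvalsh(R.T @ R))
    return float(np.sqrt(max(nu[-1] + nu[-2], 0.0)))

def schmidt_state(phi):
    """Density matrix of cos(phi)|00> + sin(phi)|11>."""
    psi = np.zeros(4, dtype=complex)
    psi[0], psi[3] = np.cos(phi), np.sin(phi)
    return np.outer(psi, psi.conj())

for phi in (0.0, 0.1, 0.3, np.pi / 4):
    print(phi, max_chsh(schmidt_state(phi)), np.sqrt(1 + np.sin(2 * phi) ** 2))
```

With the normalization used here a value above 1 signals a violation, so every $\varphi$ strictly between 0 and $\pi/2$ gives one.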

g This result was generalized to higher dimensional bipartite systems by Gisin and Peres 52.

5.6. Continuous variable systems

There is recent effort to adapt the CHSH inequality to the continuous variable
case 53,54,55,56. One possibility for deriving dichotomic observables in this case is to
utilize the apparent analogy between the parity operator in Fock space and the "spin
measurement" in $\mathbb{C}^2$ associated with the Pauli $\sigma_3$ operator.^h As admissible observables in
the continuous variable case we may then either use the set of coherently displaced parity
operators 53 or explicitly construct a direct analogue to the three Pauli operators 56 by
establishing the isomorphism $\ell^2(\mathbb{N}) \simeq \mathbb{C}^2 \otimes \ell^2(\mathbb{N})$ via collecting parities. Let



$$s_z = \sum_{n=0}^{\infty} \big( |2n\rangle\langle 2n| - |2n+1\rangle\langle 2n+1| \big) , \qquad (42)$$

$$s_+ = \sum_{n=0}^{\infty} |2n\rangle\langle 2n+1| , \qquad s_- = s_+^* , \qquad (43)$$

be the parity and parity flip operators in Fock state representation, then $(s_x, s_y, s_z)$ with
$s_x \pm i s_y = 2 s_\pm$ is again a representation of the Pauli matrices. So we can in principle make
use of all the above results, and Eqs. (39) and (40) lead to a lower bound on the maximal violation
of the CHSH inequality for continuous variable states.

A special kind of such states are Gaussian states, i.e., states having a Gaussian Wigner
distribution, which play a crucial role in quantum optics, since coherent, squeezed and thermal
states are all Gaussian. If we only consider observables given by the field quadratures
or simple functions thereof, then the positive Wigner function itself provides a "hidden
variable distribution". Hence no violation of a Bell inequality can occur. Observables
obtained from Eqs. (42) and (43), however, are not of that kind. For the pure Gaussian two mode
state

$$|\Psi_r\rangle = \sum_{n=0}^{\infty} c_n\, |n\rangle \otimes |n\rangle , \qquad c_n = \frac{\tanh^n(r)}{\cosh(r)} , \qquad (44)$$

which is characterized by the squeezing parameter r, the maximal violation with respect
to these observables is

$$\beta(r) = \sqrt{1 + \tanh^2(2r)} \qquad (45)$$

analogous to Eq. (41).
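The origin of the $\tanh(2r)$ in Eq. (45) can be made visible with a short calculation. With the parity operators of Eqs. (42) and (43) one finds $\langle s_z \otimes s_z \rangle = \sum_n c_n^2 = 1$ and $\langle s_x \otimes s_x \rangle = 2 \sum_n c_{2n} c_{2n+1}$, and the second sum equals $\tanh(2r)$, so that Eq. (40) gives $\sqrt{1 + \tanh^2(2r)}$. The sketch below checks these two sums with a truncated Fock space; the cutoff and the function names are assumptions of this example.

```python
import math

def squeezed_coefficients(r, cutoff):
    """c_n = tanh(r)^n / cosh(r) for n < cutoff, as in Eq. (44)."""
    return [math.tanh(r) ** n / math.cosh(r) for n in range(cutoff)]

def parity_flip_correlation(r, cutoff=400):
    """Truncated value of <s_x (x) s_x> = 2 sum_n c_{2n} c_{2n+1}."""
    c = squeezed_coefficients(r, cutoff)
    return 2 * sum(c[2 * n] * c[2 * n + 1] for n in range(cutoff // 2))

def max_violation(r):
    """beta(r) from Eq. (45)."""
    return math.sqrt(1 + math.tanh(2 * r) ** 2)

for r in (0.0, 0.25, 0.5, 1.0):
    norm = sum(c * c for c in squeezed_coefficients(r, 400))
    print(r, norm, parity_flip_correlation(r), math.tanh(2 * r), max_violation(r))
```

The truncation is harmless for moderate squeezing; for large r the cutoff would have to grow, since the coefficients decay only like $\tanh^n(r)$.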

h For a different approach see for instance Ref. 54, where the chosen observables distinguish between photon
counts and no photon counts.

5.7. Quantum Field Theory

It is not surprising that Quantum Field Theory should contain violations of Bell inequalities.
After all, this theory is supposed to describe Aspect's experiment, too. There
are three special reasons, though, to look at this theory specifically. The first is that here one
can take the locality assumption of the general derivations literally in the sense of Einstein
causality. Thus Alice and Bob are assigned space time regions in which they can perform
their experiments, and these regions are chosen to be spacelike separated, so that relativistic
causality forbids any signaling between the two. Since we are looking at a quantum
field theory, it is clear that quantum features are also contained in the description from
the outset. The best adapted framework for this discussion is "algebraic quantum field
theory 57", also called "Local Quantum Physics 58": here the concepts of quantum structure
and relativistic localization are taken as the axiomatic starting point.

The second feature making quantum field theory interesting is that we have here a 
distinguished state, the vacuum. As it is well known that there are always vacuum fluctu- 
ations, it is natural to ask whether these fluctuations are classical or not. More concretely: 
if Alice and Bob are in spacelike separated laboratories, can they get a violation of Bell's 
inequalities from vacuum fluctuations alone? It turns out 59 that they can, although the 
effect is extremely small for large spatial separation: in a massive theory, it decreases exponentially
with the separation of the regions on the scale of the Compton wavelength.
On the other hand, one gets maximal violation if the regions are very close 47 . 

The third reason to look at quantum field theory emerges from these studies: it turns
out that not only the vacuum produces maximal violations at short range, but any state
which is not extremely singular^i (e.g., requires only finite total energy). This is a new
possibility arising only in theories with infinitely many degrees of freedom, and quantum 
field theory is an ideal testing ground for ideas about this phenomenon. 

We remark that it is still an open problem to show that even at arbitrarily large distance 
a (necessarily exponentially small) violation of the CHSH inequality can be detected. What 
is known so far is only that the positive partial transpose property always fails 60 , and hence 
the vacuum is not classical at any distance. 

6. Quantum Violations: Beyond CHSH 

6.1. Bipartite systems with more than two observables 

For the case of more than two dichotomic observables per site only little is known. In 
particular there is yet no explicit characterization of the extremal inequalities, although 
constructing some inequalities, e.g. by chaining CHSH inequalities 61 , is not difficult. 
However, Cirelson 62,63 recognized that the quantum correlation functions, which are in
general rather cumbersome objects, can be reexpressed in terms of finite dimensional
vectors in Euclidean space.

If we have two sets of observables $\{A_1(s)\}$ and $\{A_2(t)\}$ (again Hermitian with $-1 \le A \le 1$),
where $s \in \{1, \ldots, p\}$ and $t \in \{1, \ldots, q\}$, then for any state $\rho$ there exist sets of real
unit vectors $\{x_s\}$ and $\{y_t\}$ in the Euclidean space of dimension q + p such that

$$\operatorname{tr}\big(\rho\, A_1(s) \otimes A_2(t)\big) = \langle x_s, y_t \rangle \qquad \forall\, s, t . \qquad (46)$$

For the case of two observables on one site and an arbitrary number on the other, Cirelson
showed that the maximal quantum violation is $\sqrt2$ and thus already obtained for the CHSH
inequality (p = q = 2). For an increasing number of observables on both sites, however,
he obtained the Grothendieck constant ($\approx 1.782$), known from the geometry of Banach
spaces, as a limit for the maximal violation.

i Technically speaking: any state which is locally normal with respect to the vacuum.

6.2. The multipartite case - the role of GHZ states and Mermin's inequality

Just as the four Bell states are of particular importance for the CHSH operator, the
generalized GHZ states 64, which are up to local unitaries of the form

$$|\Psi_{GHZ}\rangle = \tfrac{1}{\sqrt2} \big( |00\ldots0\rangle + |11\ldots1\rangle \big) , \qquad (47)$$

play a special role for all multipartite inequalities with two dichotomic observables per
site. In fact, they provide a basis of eigenstates for all the Bell operators with extremal
observables and thus lead to maximal violations 49,14.

If we again consider observables of the form in Eq. (36), which are obtained after applying
local unitaries, and let $\Omega \in \{0,1\}^n$ and its complement $\bar\Omega$ with $\bar\Omega_k = 1 - \Omega_k$ characterize
vectors on the n-fold tensor product, then

$$\bigotimes_{k=1}^{n} A_k(s_k)\, |\Omega\rangle = f_\Omega(s)\, |\bar\Omega\rangle , \quad\text{with} \qquad (48)$$

$$f_\Omega(s) = \exp\Big[ i \sum_{k=1}^{n} \alpha_k(s_k)\, (-1)^{\Omega_k} \Big] . \qquad (49)$$

Therefore the set of $2^n$ GHZ-like basis vectors $|\Psi_\Omega\rangle = \tfrac{1}{\sqrt2}\big( e^{i\theta_\Omega} |\Omega\rangle + |\bar\Omega\rangle \big)$ satisfies the
eigenvalue equations $B|\Psi_\Omega\rangle = \lambda_\Omega |\Psi_\Omega\rangle$ if $\theta_\Omega$ is chosen such that

$$\lambda_\Omega = e^{i\theta_\Omega} \sum_{s} \beta(s)\, f_\Omega(s) \qquad (50)$$

is real. Hence any Bell operator for multipartite systems with two dichotomic observables
per site indeed admits a spectral decomposition into GHZ states 49, and it is an immediate
corollary thereof that GHZ states lead to maximal violations. It was shown in Ref. 14 that
the computation of the latter can be reduced to a variational problem with just one free
variable per site. Moreover, every extreme point of the convex body of the quantum
mechanically attainable correlation functions is found in the generalized GHZ states 14.

It is crucial for the derivation of all the above results that we have no more than two 
dichotomic observables, since three or more vectors do in general not lie in a plane and we 
would therefore not be able to restrict to observables of the form in Eq.(36). 

j In fact, Mermin 37 derived the inequality corresponding to Eq. (51) for odd n, whereas Ardehali 38 in turn
obtained the one for even n. It was then Klyshko and Belinskii 40 who recognized that Eq. (51) covers
both types of inequalities (see also Ref. 41). However, since the basic idea goes back to David Mermin,
it seemed justifiable to us to refer to the inequality as "Mermin's inequality".

If we now fix the state to be of the GHZ form (47) and ask for the inequality within
the set (18) leading to the maximal violation, we obtain Mermin's inequality 37.^j For
an n-partite system the Bell operator corresponding to Mermin's inequality is defined
recursively, starting with $B_1 = A_1$, by

$$B_n = \tfrac12\, B_{n-1} \otimes (A_n + A_n') + \tfrac12\, B_{n-1}' \otimes (A_n - A_n') , \qquad (51)$$

where $B_n'$ is obtained from $B_n$ by exchanging all the contained observables $A \leftrightarrow A'$.
Permitting arbitrary unrelated choices for the n-1 partite Bell operators $B_{n-1}$ and $B_{n-1}'$,
it was shown in Ref. 14 that any inequality of the set (18) can be obtained from Eq. (51),
i.e., by nesting CHSH inequalities. Squaring the Bell operator as in Eq. (31) leads to
$\operatorname{tr}(\rho B_n) \le 2^{(n-1)/2}$, which is however only saturated for Mermin's inequality 14. Hence the
maximal violations grow exponentially with the number of subsystems. However, one has
to keep in mind that the joint efficiency of n independent detectors would in turn decline
exponentially in n.
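The recursion for $B_n$ can be followed numerically for small n. The sketch below uses the form $B_n = \tfrac12 B_{n-1} \otimes (A_n + A_n') + \tfrac12 B_{n-1}' \otimes (A_n - A_n')$ with $B_1 = A_1$, takes the same pair $A = \sigma_x$, $A' = \sigma_y$ at every site (a choice made for this example, not prescribed by the text), and compares the largest eigenvalue with $2^{(n-1)/2}$. A second helper evaluates the same recursion on $\pm 1$ numbers to confirm the classical bound 1.

```python
import numpy as np
from itertools import product

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])

def mermin_operator(n, a=SX, a_prime=SY):
    """B_n from the recursion, with identical observables (a, a') on all n sites."""
    b, b_prime = a, a_prime
    for _ in range(n - 1):
        b, b_prime = (0.5 * (np.kron(b, a + a_prime) + np.kron(b_prime, a - a_prime)),
                      0.5 * (np.kron(b_prime, a_prime + a) + np.kron(b, a_prime - a)))
    return b

def classical_value(assignment):
    """Same recursion on numbers; assignment = [(A_1, A_1'), ..., (A_n, A_n')], entries +-1."""
    b, b_prime = assignment[0]
    for a, a_prime in assignment[1:]:
        b, b_prime = (0.5 * (b * (a + a_prime) + b_prime * (a - a_prime)),
                      0.5 * (b_prime * (a_prime + a) + b * (a_prime - a)))
    return b

def classical_maximum(n):
    return max(classical_value(list(zip(v[::2], v[1::2])))
               for v in product((-1, 1), repeat=2 * n))

for n in (2, 3, 4):
    quantum = np.linalg.eigvalsh(mermin_operator(n)).max()
    print(n, classical_maximum(n), quantum, 2 ** ((n - 1) / 2))
```

With these settings the operator $B_n$ only couples $|00\ldots0\rangle$ and $|11\ldots1\rangle$, so the eigenvector of the largest eigenvalue is a GHZ state up to a relative phase.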

Finally we should mention that for the case of multipartite systems with an infinite
set of settings, i.e. more than two dichotomic observables per site, Zukowski 65 derived an
inequality for which the GHZ state leads to an exponentially growing violation of $\tfrac12 (\pi/2)^n$.

7. Relations to Quantum Information Theory 

One of the essential innovations of quantum information theory is to think of entangle- 
ment as a resource for quantum information processing purposes. This new point of view 
led to a dramatic increase of knowledge about the structure of the state space with respect 
to entanglement properties. Whereas in the late eighties there was hardly any difference 
between entangled states and states violating some Bell inequality, we have a much more 
subtle discrimination nowadays. 

Figure 1 summarizes the previously discussed relations between various degrees of 
classicalness, i.e., negations of entanglement properties. The implication "PPT $\Rightarrow$ Bell
inequalities" thereby means that this holds for all inequalities of the form in Eq. (18). In
other words, there exists a local hidden variable model if we only consider full correlation 
functions of two dichotomic observables per site. Since no counterexample could be found 
for other cases so far one might follow Asher Peres 42 and conjecture that positivity of the 
partial transpose generally implies the existence of a local hidden variable model. 

Other open problems are associated with the distillability of entangled states, i.e., 
the possibility of extracting maximally entangled states by means of classical communi- 
cation and local operations on several copies of the input state. It is for instance still an 
open question whether positivity of the partial transpose is necessary for undistillability 66 . 
Moreover, it is yet not clear whether the violation of a bipartite Bell inequality already 
implies distillability. For multipartite systems, however, the structure of the state space
with respect to entanglement properties is much richer, and Dür 15 recently showed that
there indeed exist undistillable multipartite states^l violating Mermin's inequality.



k A different way of deriving Mermin's inequality is obtained by identifying the expectation values of
observables A and A' of a single site with a square in the complex plane. After a suitable linear transformation
(a $\pi/4$ rotation and a dilation) we can take it as the square S with corners $\pm 1$ and $\pm i$. The
pair of expectation values of A and A' is thus replaced by the single complex number $\operatorname{tr}(\rho a)$, where
$a = \tfrac12 \big( (A + A') + i(A' - A) \big) = e^{-i\pi/4} (A + iA')/\sqrt2$. The basic idea behind this transformation is that
products of complex numbers lying in S again lie in S. Since this also holds for convex combinations and
pure states within a local classical model are always of a product form, the statement that the product
of such complex expectations lies in S indeed corresponds to a Bell inequality. In fact, this is essentially
Mermin's inequality (see Ref. 13).
l The example in Ref. 15 makes use of at least eight parties. However, the same techniques can be applied
for 4-partite systems.

[Figure 1: diagram with the nodes Separability, Undistillability and LHV model]

Fig. 1. Relations between various degrees of "classicalness"

Of course Fig. 1 is far from being complete. Relations between Bell inequalities and
usefulness for teleportation 67, quantum key distribution and quantum secret sharing 68,69
have also been studied. Scarani and Gisin 69 for instance have recently argued that in a
secret sharing protocol the authorized partners have a higher mutual information than
the unauthorized ones iff they could violate Mermin's inequality. Such a connection has
already been suggested by the coincidence that for two-partner key distribution a secret
key can be established using one-way privacy amplification iff the two partners can violate
the CHSH inequality 70,71.

In this way Bell inequalities seem to appear in a new context, playing the role of
witnesses for the usefulness of a state for certain quantum communication purposes.

However, the resource point of view of entanglement also requires a quantitative description,
which tells us how much entanglement is present in a given quantum state. So
why not take the maximal violation of a Bell inequality as a measure of entanglement?
Although this seems to be quite reasonable at first, the works of Popescu 9 and Gisin 11
have shown that the maximal violation is in fact not capable of measuring entanglement.
By definition entanglement is that part of the correlations between several subsystems which
is not "classical". Therefore a measure of entanglement has to be able to distinguish this
non-classical part from classical correlation, which can increase under local operations and
classical communication (LOCC). However, Refs. 9,11 have shown in an impressive way that
the maximal violation of a Bell inequality does not behave monotonically under LOCC
operations. Therefore Bell violations merely give a hint of the strength of entanglement
but they do not fulfill the usual requirements for entanglement measures.

Appendix A

8. Bell inequalities and convex geometry 

In this appendix we return to the construction of Bell inequalities, which we started to 
discuss in Sec. 3., and emphasize that finding all the Bell inequalities is a special instance 
of a standard problem in convex geometry known as the convex hull problem. This point 
of view provides a rather intuitive geometrical interpretation of Bell inequalities as well as 
making contact to one of the oldest fields in mathematics 72 . 

Let us again consider an n-partite system, where each party has the choice of m v-valued
observables to be measured. Note that we choose such a symmetric setting just in order
to circumvent cumbersome notation - the basic idea, however, is obviously independent
of the type of considered correlations.

Hence, we have $m^n$ different experimental setups and each of them may lead to $v^n$
different outcomes, such that the raw experimental data are made up of $(mv)^n$ probabilities.
These numbers form a vector $\xi$ lying in a space of dimension $(mv)^n$ (minus a few for
normalization constraints), which we will refer to as the correlation space.

Now, we ask for the region S in this correlation space which is accessible within
any local classical model. The crucial characteristic of these models is that any vector
$\xi$ is generated by specifying probabilities for each classical configuration, i.e., for every
assignment of one of the v values to each of the nm observables. Locality is thereby
expressed in the fact that the assignment of a value to an observable at one site does not
depend on the choice of observables at the other sites.

Since every classical configuration c is also represented by a vector $\epsilon_c$ of probabilities,
the classically accessible region is just the convex hull of at most $v^{nm}$ explicitly known
extreme points:

$$S = \operatorname{conv}\{\epsilon_c\} . \qquad (A.1)$$

Since the number of configurations is finite, S is a convex polytope.
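The extreme points in Eq. (A.1) are simple to enumerate for small cases. The following sketch lists, for given (n, m, v), one 0/1 probability vector per classical configuration, with one entry for every combination of experimental setup and outcome tuple. The function name and the ordering of the entries are choices made for this example.

```python
from itertools import product

def classical_vertices(n, m, v):
    """One probability vector per classical configuration (a value for each of the n*m observables)."""
    setups = list(product(range(m), repeat=n))        # m^n choices of observables
    outcomes = list(product(range(v), repeat=n))      # v^n outcome tuples per setup
    vertices = set()
    for config in product(range(v), repeat=n * m):    # config[k*m + s] = value of observable s at site k
        vertex = tuple(
            1 if all(config[k * m + setup[k]] == outcome[k] for k in range(n)) else 0
            for setup in setups
            for outcome in outcomes
        )
        vertices.add(vertex)
    return vertices

for n, m, v in ((2, 2, 2), (2, 3, 2)):
    verts = classical_vertices(n, m, v)
    print((n, m, v), len(verts), len(next(iter(verts))))
```

For (2,2,2) this yields 16 distinct vertices in a 16-dimensional space, and for (2,3,2) it yields 64 vertices in 36 dimensions, in line with the counts $v^{nm}$ and $(mv)^n$.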

8.1. Representations of convex polytopes and the hull problem 

Every polytope has two representations. It can be expressed either in terms of a
finite number of extreme points (V representation) or as the intersection of halfspaces (H
representation), i.e., as the set of solutions to a system of linear inequalities, which in our
case are the Bell inequalities. The set of linear inequalities corresponding to all halfspaces
containing S is represented by the set

$$ S^* = \{\beta \mid \forall c: \langle \beta, e_c \rangle \le 1\}, \qquad \text{(A.2)} $$

called the polar of S, which is in turn a convex polytope of the same dimension. The 
duality between a convex set and its polar is a generalization of the duality between regular 
platonic solids, under which the octahedron and the cube as well as the dodecahedron and 
the icosahedron are polars of each other. 

It is obvious from the convexity of S that for each $\beta \in S^*$ the inequality $\langle \beta, \xi \rangle \le 1$ is
necessary for $\xi \in S$. Moreover, the Bipolar theorem 73 says that the collection of all these
inequalities is also sufficient, and since $S^*$ is also convex it suffices to look at the extreme
points of $S^*$. We are therefore left with the following problem$^{m}$, known as the convex hull or
face enumeration problem: given the extreme points of a polytope, find the extreme points
of its polar. If there are initially more points than extreme points, then the convex hull
problem is said to be degenerate.
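
The membership condition (A.2) can be checked directly on a small example. The sketch below assumes, for brevity, the polytope of full correlations for two parties with two ±1-valued observables each, rather than the full probability vectors used above; the names `in_polar` and `tight_vertices` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# Extreme points: products (a1*b1, a1*b2, a2*b1, a2*b2) of deterministic
# +-1 assignments. Different assignments may give the same point.
vertices = sorted({
    (a1 * b1, a1 * b2, a2 * b1, a2 * b2)
    for a1, a2, b1, b2 in product((1, -1), repeat=4)
})

def pairing(beta, e):
    """Scalar product <beta, e>."""
    return sum(b * x for b, x in zip(beta, e))

def in_polar(beta, verts):
    """True if <beta, e> <= 1 for every extreme point e, cf. (A.2)."""
    return all(pairing(beta, e) <= 1 for e in verts)

def tight_vertices(beta, verts):
    """Extreme points at which the inequality holds with equality."""
    return [e for e in verts if pairing(beta, e) == 1]

half = Fraction(1, 2)
chsh = (half, half, half, -half)   # CHSH inequality, normalized to bound 1

print(in_polar(chsh, vertices))             # CHSH is a valid inequality
print(in_polar((1, 1, 1, -1), vertices))    # the unnormalized vector is not in the polar
print(len(tight_vertices(chsh, vertices)))  # number of saturating extreme points
```

In this four-dimensional example the normalized CHSH vector lies in the polar and is saturated by four of the eight extreme points, which is consistent with it defining a facet. Enumerating all such extreme points of the polar is exactly the convex hull problem described above.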

8.2. The number of facets 

The total number of Bell inequalities for a given class of correlations is not known except
for the case described in Sec. 3. However, convex geometry provides a number of general
results and upper bounds, which we will briefly discuss. The best-known result
in the combinatorial theory of polytopes is probably the Euler-Poincaré 74 relation, stating
that the numbers $\{f_j\}$ of faces$^{n}$ of dimension j are related via $\sum_{j=0}^{D-1} (-1)^j f_j = 1 - (-1)^D$,
where D is the dimension of the polytope. Moreover, McMullen's 75 upper bound theorem
implies that a polytope with N vertices has at most $f_{D-1} \sim O(N^{\lfloor D/2 \rfloor})$ facets, which is
in our case the number of Bell inequalities. This bound is also tight, as exhibited by cyclic
polytopes. However, the class of polytopes occurring in the construction of Bell inequalities
is of a rather special type, often referred to as 0-1 polytopes since the components of any
extreme point $e_c$ are either 0 or 1. The question whether the number of facets of such
polytopes is bounded by an exponential in D was only recently answered in the negative
by Bárány and Pór, who showed that there exists a positive constant c such that

$$ f_{D-1} > \left( \frac{cD}{\log D} \right)^{D/4} \qquad \text{(A.3)} $$

for some 0-1 polytopes. Hence the growth can in general be superexponential.
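
The two general statements above can be made concrete with a few numbers. The following sketch checks the Euler-Poincaré relation on some well-known face vectors and evaluates the facet count of cyclic polytopes, which attain the upper bound. The closed formula used in `cyclic_facets` is the standard one from the polytope literature and is not taken from this text; the function names are chosen for this example only.

```python
from math import comb

def euler_poincare_holds(f):
    """Check sum_j (-1)^j f_j = 1 - (-1)^D for f = (f_0, ..., f_{D-1})."""
    D = len(f)
    return sum((-1) ** j * fj for j, fj in enumerate(f)) == 1 - (-1) ** D

def cyclic_facets(N, D):
    """Number of facets of the cyclic polytope with N vertices in dimension D."""
    lo, hi = D // 2, (D + 1) // 2   # floor(D/2) and ceil(D/2)
    return comb(N - hi, lo) + comb(N - lo - 1, hi - 1)

# Face vectors (vertices, edges, facets, ...) of some familiar polytopes.
f_vectors = {
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron": (12, 30, 20),
    "4-simplex": (5, 10, 10, 5),
    "4-cube": (16, 32, 24, 8),
}
for name, f in f_vectors.items():
    print(name, euler_poincare_holds(f))

# Maximal facet numbers for N vertices in dimension D; the growth in N
# is of order N^floor(D/2).
for N, D in [(6, 4), (16, 4), (16, 8)]:
    print(N, D, cyclic_facets(N, D))
```

Note that the face vectors of the cube and the octahedron, and of the dodecahedron and the icosahedron, are reversals of each other, as expected for polar pairs. The pairs (N, D) in the last loop are arbitrary illustrative values and are not meant to be the parameters of a specific Bell polytope.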

8.3. Complexity of convex hull algorithms 

There are several ways of measuring the complexity of a convex hull algorithm 77 . 
Basically, there are two points in which different approaches differ from each other: the 
elementary operations (bit operations vs. elementary arithmetic operations) and the role 
of the output (whether it is a parameter of the measure or not). 

One way would be just to count the number of elementary arithmetic operations and 
to assume that the storage of an integer number takes a unit space (which is essentially 
the unit cost RAM model as opposed to the Turing model). Fixing the dimension D of the
polytope and then looking at the worst-case running time as a function of N, an optimal
algorithm is known 78 that runs in time $O(N^{\lfloor D/2 \rfloor})$, as already suggested by McMullen's
upper bound theorem.



$^{m}$ See also the problem page http://www.imaph.tu-bs.de/qi/problems/l.html.

$^{n}$ A subset of the polytope P is called a face if it is the intersection of the polytope with one of its supporting
hyperplanes, i.e., a plane h such that one of the closed halfspaces of h contains P. The faces of dimension
0, 1, and $D-1$ are called vertices (extreme points), edges, and facets, respectively. Every face is again a convex polytope.

For the general nondegenerate convex hull problem there are algorithms with run times
which are polynomially bounded by N, D and $f_{D-1}$ (e.g. 79). For the degenerate case,
however, no such polynomial algorithm is known.

For a more detailed discussion of the complexity of finding a complete set of Bell
inequalities we refer to the work of Pitowsky 28,29, who also discusses the
relation between the convex hull problem and the notorious NP = P and NP = coNP
questions. It is shown that deciding membership in a "correlation polytope" is an
NP-complete problem, whereas deciding facets is probably not even in NP. Moreover, in Ref. 29
the relation to the minimum energy problem for Ising spin systems is discussed, which in
turn was shown to be NP-hard by Barahona 80.



